Multimodal Corpus of Speech Production: Work in Progress
Authors
Abstract
This paper presents work in progress on multimodal articulatory data collection using several instrumental techniques: electrolaryngography (EGG), electropalatography (EPG) and electromagnetic articulography (EMA). The data are recorded from two native Estonian speakers (one male and one female); the target size of the corpus is approximately one hour of speech from both subjects. The paper introduces the instrumental systems and recording set-ups used for data collection, gives examples of multimodal data analysis and discusses possible uses of the corpus.
Similar resources
Evaluating Factors Impacting the Accuracy of Forced Alignments in a Multimodal Corpus
People, when processing human-to-human communication, utilize everything they can in order to understand that communication, including speech and information such as the time and location of an interlocutor’s gesture and gaze. Speech and gesture are known to exhibit a synchronous relationship in human communication; however, the precise nature of that relationship requires further investigation...
Steps toward Flexible Speech Recognition – Recent Progress at Tokyo Institute of Technology –
This paper describes recent progress at Tokyo Institute of Technology and the author’s perspectives for making speech recognition systems more flexible at both the acoustic and linguistic processing levels. Specifically, it describes a broadcast news transcription system, a multimodal dialogue system for information retrieval, neural-network-based HMM adaptation for noisy speech, online increme...
The REPERE challenge: finding people in a multimodal context
The REPERE Challenge aims to support research on people recognition in multimodal conditions. To assess the technology progress, annual evaluation campaigns will be organized from 2012 to 2014. In this context, the REPERE corpus, a French video corpus with multimodal annotation, has been developed. The systems which participated in the dry run had to answer the following questions : Who is spea...
A Corpus of Natural Multimodal Spatial Scene Descriptions
We present a corpus of multimodal spatial descriptions, as commonly occurring in route giving tasks. Participants provided natural spatial scene descriptions with speech and abstract deictic/iconic hand gestures. The scenes were composed of simple geometric objects. While the language denotes object shape and visual properties (e.g., colour), the abstract deictic gestures “placed” objects in ge...
A Multimodal Real-Time MRI Articulatory Corpus for Speech Research
We present MRI-TIMIT: a large-scale database of synchronized audio and real-time magnetic resonance imaging (rtMRI) data for speech research. The database currently consists of speech data acquired from two male and two female speakers of American English. Subjects’ upper airways were imaged in the midsagittal plane while reading the same 460 sentence corpus used in the MOCHA-TIMIT corpus [1]. ...
Publication date: 2012